THEIR IDENTITY FROM MULTIPLE MARKED ANIMALS AND ITS
APPLICATION TO THE PETERSEN METHOD
Louisiana Wild Life and Fisheries Commission
Baton Rouge, Louisiana

and
Robert J. Muncy
School of Forestry
Louisiana State University
Baton Rouge, Louisiana
Abstract 
In order to make an estimate of the size of a population of animals at a given time
by the Petersen method, use is made of a sample of the fraction of marked animals in the
population. However, if some of the animals originally marked lose their marks and thus
cannot be identified in the sample, a Petersen type estimate will be biased, the magnitude
of the bias depending upon the proportion of animals retaining their identity. If
an estimate can be made of the animals which have retained their identity at a given time,
it is possible to make corrections for this bias. This report presents formulas for
estimating the number of marked animals which have retained their identity at a given
time from multiple marked animals and shows their derivation, shows their application to
the Petersen method, discusses the necessary conditions for them to apply, discusses the
errors associated with such estimates, and shows how confidence limits can be determined.


Introduction 
In order to make an estimate of the size of a population of animals at a given time
by the Petersen method (or what game biologists commonly refer to as the Lincoln Index),
use is made of a sample of the fraction of marked animals in the population. The assumption
is made that this sample fraction is an unbiased estimate of the fraction of marked
animals in the population. However, if some of the animals originally marked lose their
marks and thus cannot be identified in the sample, a Petersen type estimate will be biased,
the magnitude of the bias depending upon the proportion of animals retaining their
identity. If an estimate can be made of the number of animals which have retained their
identity at a given time, it is possible to make corrections for this bias. The purpose
of this report is to present formulas for estimating the number of marked animals which
have retained their identity at a given time from multiple marked animals and to show
their derivation, to show their application to the Petersen method, to discuss the
necessary conditions for them to apply, to discuss the errors associated with such
estimates, and to show how confidence limits can be determined. Also, it is suggested
that more studies using techniques such as multiple marking be undertaken to determine
how serious the bias from the loss of marks is in population estimates.

There appears to be considerable confusion about the correct method of estimating
the number of marked animals which have retained their identity. For example,
Iurry (1960) used a procedure which gave a valid estimate of the number of
marked deer that retained their identity; however, Dennett and Kidd (1960) used a
procedure to estimate the number of marked animals which retained their identity which
gave an estimate with a positive bias. It is hoped that this discussion of
the essential points in population estimates in the case of multiple marking will
illustrate the correct method.

Even though the previous studies were in reference to terrestrial animals, the same
procedure applies to aquatic animals. Beverton and Holt (1957), in their book "On the
Dynamics of Exploited Fish Populations," consider this problem, and they especially cover
the case where the losses of marks from multiple marked animals are not independent, which
is not covered in this report.


The Petersen Method 
A number of marked animals are placed in the population and then during some succeeding
interval a sample of the population is taken to estimate the fraction of marked
animals in the population. The size of the population of animals can be estimated as
follows:

$$t = \frac{M}{x/n} \tag{1}$$

or $t = Mn/x$

where t = estimate of T, the size of the population at time of marking
      M = number of animals originally marked
      x = number of marked animals in the sample
      n = number of animals in the sample

In the notation employed in this report a capital letter usually refers to a
population parameter and a small letter refers to a sample parameter or estimate from
the sample.

Formula (1) is the maximum-likelihood estimate of the population and it is consistent
in the sense that it tends to the correct value as the sample fraction is increased.
However, it is biased, and in small samples this bias can be substantial
(Bailey, 1952). This bias is positive and of the order $f^{-1}$, where $f = E(x) = Mn/T$
(Bailey, 1952). However, in large enough samples this bias will not be serious. Bailey
proposed a modified formula which gives an almost unbiased estimate. Thus:

$$t = \frac{M(n+1)}{x+1} \tag{2}$$

Bailey (1952) and others have pointed out that the estimate of the reciprocal of the
population size, $Y = T^{-1}$, is unbiased. Thus:

$$y = \frac{x/n}{M} \tag{3}$$

or $y = x/(nM)$

where $y = t^{-1}$, the estimate of the reciprocal of the population size.

Also, according to Ricker (1958) the reciprocal has the advantage that it tends
to be distributed more nearly symmetrically about the population mean.

Ricker (1958) lists the necessary conditions for the application of formulas (1)
through (3). These are:
(1) The marked animals suffer the same mortality as the unmarked.
(2) The marked animals are as vulnerable to recapture as the unmarked animals.
(3) The marked animals do not lose their mark.
(4) The marked animals become randomly mixed with the unmarked animals or
    the subsequent sample is proportional to the number of animals present
    in different parts of the population.
(5) All marked animals are recognized and reported on recovery.
(6) There is only a negligible amount of recruitment to the catchable
    population during the time the recoveries are being made.

The satisfaction of all of the above conditions is important in the straightforward
application of the Petersen method. However, this paper is concerned mainly with condition
(3) and how to correct for it when it is not satisfied.
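
Formulas (1) through (3) can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical and do not come from the report, and the figures in the usage lines (M = 109, n = 177, x = 57) are those of the Ricker example used later in the report.

```python
def petersen(M, n, x):
    """Formula (1): t = M / (x/n) = M*n/x."""
    if x <= 0:
        raise ValueError("at least one marked animal must be in the sample")
    return M * n / x


def bailey(M, n, x):
    """Formula (2): Bailey's almost unbiased estimate, M(n+1)/(x+1)."""
    return M * (n + 1) / (x + 1)


def reciprocal(M, n, x):
    """Formula (3): y = (x/n)/M, the estimate of 1/T."""
    return x / (n * M)


if __name__ == "__main__":
    M, n, x = 109, 177, 57
    print("t (1):", petersen(M, n, x))
    print("t (2):", bailey(M, n, x))
    print("y (3):", reciprocal(M, n, x))
```

Note that formula (3) is simply the reciprocal of formula (1), so the two functions always satisfy petersen(M, n, x) * reciprocal(M, n, x) = 1.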


Development of Sampling Model 

In almost all of the development to follow, use will be made of the concept of compound
probabilities of independent events. Therefore this concept will be introduced at
the start.

If two independent events have probabilities of occurrence of $P(a_1)$ and $P(a_2)$
respectively, then the compound probability $P(a_1, a_2)$ that both events will occur
together is the product of their separate probabilities. Thus,

$$P(a_1, a_2) = P(a_1) P(a_2). \tag{4}$$

Now if $P(a_1) = P(a_2) = P(a)$, then,

$$P(a_1, a_2) = P(a)^2. \tag{5}$$

If more than two independent events are involved then,

$$P(a_1, a_2, \ldots, a_n) = P(a_1) P(a_2) \cdots P(a_n) \tag{6}$$




and if $P(a_1) = P(a_2) = \cdots = P(a_n) = P(a)$, then,

$$P(a_1, a_2, \ldots, a_n) = P(a)^n. \tag{7}$$

This can be verified in almost any introductory text dealing with probability, such 
as Goldberg (1960) or in more advanced texts such as Feller (1957) and Anderson and 
Bancroft (1952). 

We are assuming that an animal is marked with mark 1, mark 2, through mark r, that
the event (a), the loss of a mark, occurs independently and randomly throughout the
population, and that the events $(a_1), (a_2), \ldots, (a_r)$, the loss of mark 1, 2, ..., r,
respectively, are mutually independent.

Assuming that our assumptions apply, our problem is to determine the number of
animals which lose all r marks, i.e., the number of animals which lose their identity.
If we know the probabilities of losing mark 1, 2, ..., r, then from formulas (4) through
(7) we can determine the probability of losing all r marks. Probability is sometimes
defined as the proportion in a large number of repeated independent trials. Then the
number of animals expected to lose all marks would be $M P(a_1, a_2, \ldots, a_r)$, where M is
the number of animals originally marked. However, the probabilities of losing a mark
are not known, but they can be estimated from the sample. Such estimates are known as
empirical probabilities.
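
The expected number of animals losing all marks, $M P(a_1, a_2, \ldots, a_r)$, can be sketched as below under the independence assumption of formulas (4) through (7). The function names are hypothetical, and the loss probabilities in the usage lines are illustrative values only, not estimates from any study.

```python
import math


def prob_lose_all(loss_probs):
    """Formula (6): probability that every mark is lost, assuming the
    losses are mutually independent."""
    for p in loss_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie between 0 and 1")
    return math.prod(loss_probs)


def expected_identity_lost(M, loss_probs):
    """Expected number of the M marked animals that lose all marks."""
    return M * prob_lose_all(loss_probs)


if __name__ == "__main__":
    # Illustrative values: 110 animals, each mark lost with probability .10.
    for r in (1, 2, 3):
        print(r, "mark(s):", expected_identity_lost(110, [0.10] * r))
```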

Actually, we are dealing with random variables and we are interested in the joint
probability distribution of mark 1, 2, ..., r. Let's consider the joint distribution
of the loss or retention of mark 1 and mark 2. This joint distribution will be a
bivariate frequency function represented symbolically by $f(m_1, m_2)$, where $m_1$ refers
to mark 1 and $m_2$ refers to mark 2. Let event (a), the loss of a mark, be represented by 0
and event (b), the retention of a mark, be represented by 1; then their joint probability
distribution would be as follows:

(Table: joint probability distribution $f(m_1, m_2)$ over the region W, with row and
column totals.)

Note, the totals of the columns give the probability distribution of $m_1$ and the totals
of the rows give the probability distribution of $m_2$, which are designated as the marginal
distributions of $m_1$ and $m_2$. Also note that $\sum f(m_1)$ and $\sum f(m_2)$ both equal 1 and
that in this case they will be binomially distributed. The $\sum_W f(m_1, m_2)$ will also
equal 1. Since $m_1$ and $m_2$ are independent, $f(m_1, m_2) = f(m_1) f(m_2)$, and the entry in any
row and column is the product of $f(m_1)$ and $f(m_2)$, which gives the probability of
events $m_1$ and $m_2$ occurring together.

However, we are interested in a subregion, w of W, that is, the region
$(m_1 = 0, m_2 = 0)$, which in this case would be the probability of losing both marks, P(w).
Then the probability of not losing both marks, which is the subregion not included in w,
is 1 - P(w). The function $f(m_1, m_2)$ is also a random variable, and since we are interested
only in whether an animal loses both marks or does not lose both marks, this function will
be distributed binomially, i.e.,

$$[P(w) + P(q)]^n = 1, \quad \text{where } P(q) = 1 - P(w).$$


From the previous we can devise a sampling plan to estimate the number of animals
retaining their identity. If we take a random sample, with replacement, of the animals
with mark 1, then, since we are sampling from the binomial probability distribution, an
unbiased estimate of $P(a_1)$, the probability of losing mark 1, is

$$p_1 = \frac{a_1}{n} \tag{8}$$

where $p_1$ = the proportion of the animals in the sample losing mark 1
      $a_1$ = the number of animals in the sample which lost mark 1
      n = the number of animals in the sample

and the expected value of $p_1$ is

$$E(p_1) = E(a_1/n) \tag{9}$$

Likewise, $P(a_2)$, the probability of losing mark 2, is estimated by

$$p_2 = a_2/n \tag{10}$$

and

$$E(p_2) = E(a_2/n) \tag{11}$$

Then $P(a_1, a_2)$ can be estimated by

$$p_{1,2} = p_1 p_2 \tag{12}$$

where $p_{1,2}$ = the estimate of the proportion of animals originally
      marked with mark 1 and 2 which lost both mark 1 and 2

and

$$E(p_{1,2}) = E(p_1) E(p_2) \tag{13}$$

Therefore, $p_{1,2}$ is an unbiased estimate of the proportion of animals losing their identity
in an infinite population. However, this is only the most likely proportion of animals
losing their identity in the finite population of animals we have marked (which can be
considered as a sample from an infinite population). It is not necessarily the proportion
losing their identity in the finite population, even if the proportion of animals losing
mark 1 and mark 2 was known without error. If $p_{1,2}$ is an unbiased estimate, then $1 - p_{1,2}$
is an unbiased estimate of the proportion of animals retaining their identity.

So far we have considered only the joint distribution of the loss or retention of
mark 1 and mark 2; however, this can be extended in a similar manner to 3 or more marks.


Estimating Animals Which Retained Their Identity 
If a number of multiple marked animals are put into the population and then during
some succeeding interval a sample of the population is taken, it is possible to
estimate the number of marked animals which have retained their identity. Let the animals
be originally marked with two marks, mark 1 and mark 2. Let's take an unbiased sample of
the animals and assume that the loss of mark 1 is independent of the loss of mark 2 and
that the loss of marks occurs randomly throughout the population. In the sample some
animals will be found to have lost mark 1 and some to have lost mark 2. All animals marked
with mark 1 will be an unbiased sample of those which were originally marked with mark 2,
some of which lost mark 2 and some of which retained mark 2. This would have to be true
by definition, since the conditional probability of losing mark 1 given that mark 2 is not
lost is equal to the probability of losing mark 1, and the conditional probability of
losing mark 2 given that mark 1 is not lost is equal to the probability of losing mark 2,
if the losses of mark 1 and 2 are independent of each other.

Then an estimate of the proportion of animals losing mark 2 is

$$p_2 = \frac{a_2}{b_1} \tag{14}$$

where $p_2$ = estimate of the proportion of animals losing mark 2
      $a_2$ = the number of animals in the sample which lost mark 2
      $b_1$ = the number of animals in the sample with mark 1

Likewise,

$$p_1 = \frac{a_1}{b_2} \tag{15}$$

where $p_1$ = estimate of proportion of animals losing mark 1
      $a_1$ = the number of animals in the sample which lost mark 1
      $b_2$ = the number of animals in the sample with mark 2.

Then $p_{1,2,\ldots,r}$, an estimate of the proportion of animals losing their identity, or the
proportion of animals losing the first through rth tag, is, for two marks,

$$p_{1,2} = p_1 p_2 \tag{16}$$

and an estimate of the proportion of animals not losing their identity is

$$q = 1 - p_{1,2} \tag{17}$$

Then s, an estimate of the number of animals originally marked which have retained their
identity, is

$$s = qM \tag{18}$$
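
The two-mark procedure of formulas (14) through (18) can be sketched as follows. The function name and the sample counts in the usage lines are hypothetical and serve only to show the order of the calculations.

```python
def retained_identity_two_marks(M, a1, a2, b1, b2):
    """Estimate s, the number of the M marked animals retaining their identity.

    a1, a2: animals in the sample which lost mark 1, mark 2
    b1, b2: animals in the sample carrying mark 1, mark 2
    """
    if b1 <= 0 or b2 <= 0:
        raise ValueError("the sample must contain animals with each mark")
    p2 = a2 / b1          # formula (14)
    p1 = a1 / b2          # formula (15)
    p12 = p1 * p2         # formula (16), assumes independent losses
    q = 1.0 - p12         # formula (17)
    s = q * M             # formula (18)
    return {"p1": p1, "p2": p2, "p12": p12, "q": q, "s": s}


if __name__ == "__main__":
    # Hypothetical sample: 40 animals with both marks, 6 which lost mark 1
    # only and 4 which lost mark 2 only, so b1 = 44 and b2 = 46.
    print(retained_identity_two_marks(M=110, a1=6, a2=4, b1=44, b2=46))
```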

So far, we have been assuming that $P(a_1) \neq P(a_2)$; however, if we can safely assume
that the probability of losing mark 1 is equal to the probability of losing mark 2, i.e.,
$P(a_1) = P(a_2)$, it will be better to pool the sampling data and make one estimate of $P(a)$,
the probability that any mark will be lost. Then p, the estimate of the proportion of
marks lost, is

$$p = \frac{a_1 + a_2}{b_1 + b_2} \tag{19}$$

An equivalent of formula (19), which probably is the form that can be used the easiest
with field data, is

$$p = \frac{c}{2d - c} \tag{20}$$

where c = the number of animals in the sample which lost one mark
      d = number of marked animals in the sample

Then

$$p_{1,2} = p^2 \tag{21}$$

and an estimate of the number of animals originally marked which have retained their
identity can be made by using formulas (17) and (18).
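
The pooled estimate can be sketched as below, together with a check that the two forms agree when every marked animal in the sample carries either one or two marks (so that $a_1 + a_2 = c$ and $b_1 + b_2 = 2d - c$). The function names and counts are hypothetical.

```python
def pooled_loss_from_marks(a1, a2, b1, b2):
    """Formula (19): p = (a1 + a2) / (b1 + b2)."""
    return (a1 + a2) / (b1 + b2)


def pooled_loss_from_field_counts(c, d):
    """Formula (20): p = c / (2d - c), where c animals lost one mark
    out of d marked animals in the sample."""
    if not 0 <= c <= d or d == 0:
        raise ValueError("need 0 <= c <= d and d > 0")
    return c / (2 * d - c)


if __name__ == "__main__":
    # Hypothetical sample: 40 with both marks, 6 lost mark 1, 4 lost mark 2.
    both, lost1, lost2 = 40, 6, 4
    b1, b2 = both + lost2, both + lost1   # animals carrying mark 1, mark 2
    c, d = lost1 + lost2, both + lost1 + lost2
    p = pooled_loss_from_marks(lost1, lost2, b1, b2)
    print(p, pooled_loss_from_field_counts(c, d), "p12 =", p ** 2)  # (21)
```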

Now let the animals be originally marked with three marks, mark 1, mark 2, and
mark 3. Let's take an unbiased sample of the animals and assume that the losses of mark
1, 2 and 3 are independent of each other. Then the animals marked with mark 2 and/or mark 3
will be an unbiased sample of those which were originally marked with mark 1. An estimate
of the proportion of animals losing mark 1 is

$$p_1 = \frac{a_1}{b_{2,3}} \tag{22}$$

where $a_1$ = the number of animals in the sample losing mark 1
      $b_{2,3}$ = the number of animals in the sample with mark 2
      and/or mark 3

Likewise,

$$p_2 = \frac{a_2}{b_{1,3}} \tag{23}$$

and

$$p_3 = \frac{a_3}{b_{1,2}} \tag{24}$$

where $a_2$ = the number of animals in the sample losing mark 2
      $a_3$ = the number of animals in the sample losing mark 3
      $b_{1,3}$ = the number of animals in the sample with mark 1 and/or
      mark 3
      $b_{1,2}$ = the number of animals in the sample with mark 1 and/or
      mark 2

Then,

$$p_{1,2,3} = p_1 p_2 p_3 \tag{25}$$

If we can safely assume that $P(a_1) = P(a_2) = P(a_3)$, then p, the estimate of the proportion
of marks lost, is

$$p = \frac{a_1 + a_2 + a_3}{b_{2,3} + b_{1,3} + b_{1,2}} \tag{26}$$

and

$$p_{1,2,3} = p^3 \tag{27}$$

It should be noted that it is possible to assume that $P(a_1) = P(a_2) \neq P(a_3)$. Then
we should pool the data for the estimate of the proportion losing mark 1 and 2 into one
sample. Also it is possible to use more than three marks, which can be handled in
a manner similar to the above examples.
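
The extension to r marks can be sketched as follows. The representation of the sample is an assumption made for illustration only: each identifiable animal is a tuple of booleans, True where the corresponding mark was retained. The function names and the records in the usage lines are hypothetical.

```python
import math


def loss_estimates(records):
    """Per-mark loss estimates in the manner of formulas (22) to (24).

    For mark i the denominator is the number of animals carrying at least
    one other mark; the numerator is the number of those lacking mark i.
    """
    r = len(records[0])
    estimates = []
    for i in range(r):
        with_other = [rec for rec in records
                      if any(rec[j] for j in range(r) if j != i)]
        if not with_other:
            raise ValueError("no animals carry a mark other than mark %d" % (i + 1))
        lost = sum(1 for rec in with_other if not rec[i])
        estimates.append(lost / len(with_other))
    return estimates


def identity_loss(records):
    """Formula (25) generalized: product of the per-mark loss estimates."""
    return math.prod(loss_estimates(records))


if __name__ == "__main__":
    sample = ([(True, True, True)] * 20 + [(False, True, True)] * 3 +
              [(True, False, True)] * 2 + [(True, True, False)] * 1 +
              [(False, False, True)] * 1)
    print(loss_estimates(sample), identity_loss(sample))
```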
Discussion of Problems Concerned with Estimating
Number of Animals Which Have Retained Their Identity


In taking our subsequent sample of the marks and estimating the number
of animals which have retained their identity, there are possible sources of
bias which could enter into our estimate even though our sample was random.
If the sample is taken over a short period of time where the increase in the
number of marks being lost is negligible, then any possible bias would be
negligible. However, if the increase in the number of marks being lost is
substantial, it is possible that the estimate of the number of animals which
have lost their identity will be biased. During the sampling period we are
estimating the average number of marks retained, and we are making a linear
interpolation. However, if during the sampling period the graph of the number
of marks retained plotted against time departs any appreciable amount
from linearity, then our estimate would be biased. However, if the graph is
linear or essentially linear, there should be no appreciable bias.

As mentioned previously, if we can safely assume that $P(a_1) = P(a_2)$ or
$P(a_1) = P(a_2) = \cdots = P(a_r)$, then obviously it will be better to pool the
sampling data and make one estimate of $P(a)$. However, the question arises,
when is it safe to assume that the probabilities are equal? One could test
to see if the observed proportions of marks lost differ significantly by
(1) a chi-square test of independence, (2) a binomial test, or (3) if the sample
size is small, more accurately from statistical tables for use with binomial
samples such as those presented by [reference illegible]. Then if the differences are
significant, we would decide against pooling. However, in tests of this
nature, one can make the statement that the differences are non-significant
or are of such a magnitude as to be inconsequential, but it is impossible to
prove that the differences are zero. It may be that our tests are not
accurate enough to detect the differences. Also, our tests are for Type I
errors (a Type I error is to reject the null hypothesis when it is true).
However, in a test of this nature it would appear that a Type II error (a Type
II error is to accept the null hypothesis when it is false) is much more
serious than a Type I error. Therefore, in order to minimize the probability
of committing a Type II error, we believe it would be wise to accept differences
as being significant at a relatively high probability (say, a
probability of .30, which is the probability of committing a Type I error).
This is possible, inasmuch as when the sample size is fixed, choosing as
significant a larger probability of a Type I error will decrease the
probability of making a Type II error. Generally, we would recommend that one
assume that $P(a_1) \neq P(a_2)$ or $P(a_1) \neq P(a_2) \neq \cdots \neq P(a_r)$, unless
there is strong evidence to the contrary.
Estimates of Q will not be biased by assuming that the probabilities are
different, but our estimate of Q could be biased by assuming that they are
equal when in fact they are not equal.
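
The pooling decision described above can be sketched with a chi-square test of the two observed loss proportions at the .30 level. Several points are assumptions of this sketch and not statements from the report: the function names are hypothetical, the test is the ordinary 2 x 2 chi-square with one degree of freedom and no continuity correction, and the two groups (animals carrying mark 2, animals carrying mark 1) are treated as if they were independent samples even though they overlap in practice.

```python
import math


def chi_square_equal_loss(a1, b2, a2, b1):
    """Chi-square statistic and probability for H0: P(a1) = P(a2).

    Compares a1/b2 (loss of mark 1) with a2/b1 (loss of mark 2).
    """
    groups = ((a1, b2 - a1, b2), (a2, b1 - a2, b1))  # (lost, kept, total)
    n = b1 + b2
    tot_lost = a1 + a2
    tot_kept = n - tot_lost
    if tot_lost == 0 or tot_kept == 0:
        return 0.0, 1.0   # no variation, nothing to test
    chi2 = 0.0
    for lost, kept, total in groups:
        for observed, column_total in ((lost, tot_lost), (kept, tot_kept)):
            expected = total * column_total / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper tail of chi-square with 1 d.f.: P(Z^2 > chi2) = erfc(sqrt(chi2/2)).
    probability = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, probability


def should_pool(a1, b2, a2, b1, alpha=0.30):
    """Pool only if the difference is not significant at the alpha level."""
    _, probability = chi_square_equal_loss(a1, b2, a2, b1)
    return probability >= alpha


if __name__ == "__main__":
    print(chi_square_equal_loss(6, 46, 4, 44), should_pool(6, 46, 4, 44))
```

With small expected counts the chi-square approximation is poor, which is the situation for which the text recommends exact binomial tables instead.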

It should be pointed out that in order to estimate Q it is not necessary
to multiple mark all animals, i.e., if we can identify the multiple marked
animals in the sample. Also we could estimate Q from a separate experiment,
during a different time period, in a different area, or even by pooling data
from various sources. Obviously there is inherent danger in such a procedure.
One should be certain that the estimate of Q applies to the period for which
we are estimating the number of animals retaining their identity.

This paper has the objective of showing how to estimate the number of
animals which have retained their identity so that a correction can be made
for this in estimating population size. However, we would like to point out
certain aspects which should be considered even if an estimate of the number
of animals retaining their identity is not made. With some animals, e.g.,
deer, the cost of capturing the animals for marking is high. Therefore,
once such an animal is marked and released into the population we want to be
relatively certain that we will be able to identify this animal again.
Therefore, there will be obvious advantages in multiple marking such an
animal. Usually the cost of marking and handling the animal is negligible
compared to the cost of capture. Also, the cost of applying any additional
marks on the animal is negligible. Previous experiments may indicate that
a mark placed on such an animal will after a given period of time, say a
year, have a probability of being lost equal to .10. Then we would expect
that on the average 10 percent of our animals would lose their identity
while 90 percent would retain their identity after a year's time if only one
mark was placed on the animals originally. However, if two marks were placed
on each animal and each mark had a probability of being lost of .10, then
$p_{1,2} = .10^2 = .01$. Thus, we would expect that on the average only 1 percent
of the animals would lose their identity while 99 percent would retain their
identity. Now, let's place 3 marks on the animal, each with a probability of
being lost of .10; then $p_{1,2,3} = .10^3 = .001$. Thus we would expect that on
the average only .1 percent of the animals would lose their identity while
99.9 percent would retain their identity. This is a considerable reduction
from when only one mark was placed on the animal. The multiple marking would
be another approach to overcoming the problems of mark loss and may be more
fruitful than attempts toward obtaining superior marks.

With animals which are small and can be caught in large numbers it would
not be as important to make sure that the animals have such a high probability
of retaining their identity, i.e., if we can evaluate this loss and correct
for it in our population estimate. In fact, the multiple marking of such
animals might have a disadvantage. If marking and handling cause mortality,
it is possible that the use of multiple marks will cause additional mortality.
This could bias our population estimate, inasmuch as, as stated previously,
one of the necessary conditions for the application of the Petersen estimate
is that marked animals suffer the same mortality as the unmarked.

Estimating the Population

Once we have determined s, the estimate of the number of animals originally
marked which have retained their identity, we are in a position to
make a more nearly unbiased estimate of the population by the Petersen method. In
the Petersen formulas, formulas (1), (2), and (3), M is the number of animals
originally marked. However, some of these animals will not be available to be
caught as marked animals in the sample because they have lost their identity.
Therefore if we substitute s for M in the Petersen formulas we will have a
more nearly unbiased estimate of the population. Thus, formula (1) becomes

$$t = \frac{s}{x/n}$$

or

$$t = sn/x \tag{28}$$

and formula (2) becomes

$$t = \frac{s(n+1)}{x+1} \tag{29}$$

and formula (3) becomes

$$y = \frac{x/n}{s}$$

or

$$y = x/(ns) \tag{30}$$


We have been assuming that the losses of marks on an animal are
independent, i.e., their loss is not correlated. It will be
worthwhile to consider the consequences if they are positively correlated,
i.e., the loss of one mark increases the probability that another mark will
be lost. (It would appear that the possibility of them being negatively
correlated is unlikely.) Then our estimate $p_{1,2}$ will have a negative
bias while our estimate of Q will have a positive bias, i.e., on the average
q > Q. Then on the average s > S, where S is the true number of marked animals
retaining their identity, and on the average our population estimate will have
a positive bias, i.e., t > T. However, if we did not
estimate s (even though s is biased) our population estimate would on the
average have an even greater bias, i.e., on the average $t_1 > t_2 > T$, where
$t_1$ is the population estimate when no estimate is made of the number of
animals retaining their identity and $t_2$ is the population estimate where an
estimate is made of the number of animals retaining their identity. Thus,
even though the losses of marks are positively correlated it will be worthwhile
to make a biased estimate of Q, for it will always allow us to remove
some, even though not all, of the bias from our estimate of T.

SAMPLING ERROR


Any estimate is subject to experimental error and it is important to make
some statement about the probable size of such error. The size of such error
must be evaluated, at least approximately, before any confidence can be placed
in an estimate. In the usual Petersen type estimate there will be sampling
error in x (or x/n) and in formulas (28), (29) and (30) there will also be
sampling error in our estimate of s. If our assumptions are met, all of the
quantities to be estimated will be distributed according to the binomial
distribution, if sampling is done with replacement. However, if sampling is done
without replacement they will be distributed according to the hypergeometric
distribution. However, in practice, it would rarely happen that so great a
fraction of the population is sampled that the hypergeometric would differ
appreciably from the binomial. Thus, in most instances no appreciable error would
be committed by taking the distribution to be binomial. We are also considering
that our sample size is large enough and that the mean of our estimate of
the population size, t or y, is large enough for large sample theory to apply
and so that t and y may be regarded as normal variables. Then the 95 percent
confidence limits of our estimate (actually the 95.4 percent limits) can be
approximated by

$$(\underline{t}, \overline{t}) = t \pm 2\sqrt{v(t)} \tag{31}$$

where $\underline{t}$ = the lower limit of the 95 percent confidence interval
      $\overline{t}$ = the upper limit of the 95 percent confidence interval
      v(t) = the sample estimate of the variance of t

and $(\underline{y}, \overline{y})$ can be estimated in a similar manner.
In calculating confidence limits as outlined above, they should preferably
be calculated for statistics whose distribution is as "normal" as possible.
Ricker (1958) states that in estimating the population size, the reciprocal
of t tends to be distributed symmetrically about the mean, while t often is
not symmetrically distributed about the mean. Therefore, he suggests that
confidence limits first be computed for y (where $y = t^{-1}$) and then inverted
in order to obtain limits for t.
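
Formula (31) and the inversion of the limits of y can be sketched as follows. The function names are hypothetical, and the multiplier of 2 is the approximation used in the text rather than an exact normal deviate.

```python
import math


def confidence_limits(estimate, variance, multiplier=2.0):
    """Formula (31): estimate plus or minus 2 * sqrt(variance)."""
    if variance < 0:
        raise ValueError("variance must be non-negative")
    half_width = multiplier * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


def invert_limits(y_lower, y_upper):
    """Turn limits for y = 1/t into limits for t.

    The upper limit of y gives the lower limit of t and vice versa.
    """
    if y_lower <= 0:
        raise ValueError("the lower limit of y must be positive to invert")
    return 1.0 / y_upper, 1.0 / y_lower


if __name__ == "__main__":
    print(confidence_limits(339, 1362.677))
    print(invert_limits(0.002, 0.004))
```

Because the inversion is not linear, limits that are symmetrical about y are not symmetrical about t.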

According to the binomial theory, the large sample estimate of the variance
of a proportion is:

$$v(p') = p'(1 - p')/n \tag{32}$$

where v(p') = the sample estimate of the variance of p'
      p' = the sample estimate of the proportion
      n = sample size


In many of the estimates discussed in this report there will be experimental
error in two or more components and it will be necessary to compound these
errors. Therefore we will present formulas for the large sample approximation
of the variance of a product and the variance of a quotient. If Z = XY, then
the variance of Z is

$$V(Z) = Y^2 V(X) + X^2 V(Y) \tag{33}$$

where V(Z) = the variance of Z
      V(X) = the variance of X
      V(Y) = the variance of Y.

Now if Z = Y/X, then the variance of Z is

$$V(Z) = \frac{Y^2}{X^4} V(X) + \frac{1}{X^2} V(Y) \tag{34}$$

In sampling, we substitute estimates from the sample into the above formulas
and thus we can arrive at an estimate of the variance of the product or
quotient. The development of these formulas is shown in appendix (1).
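
Formulas (32) through (34) can be sketched as small helper functions. The names are hypothetical; the last usage line applies the quotient formula to t = M/(x/n) with M = 109 known without error and x/n rounded to .322, the figures used in the example that follows in the text.

```python
def var_proportion(p, n):
    """Formula (32): large sample variance of a proportion, p(1-p)/n."""
    return p * (1.0 - p) / n


def var_product(X, Y, var_X, var_Y):
    """Formula (33): variance of Z = XY for independent X and Y."""
    return Y ** 2 * var_X + X ** 2 * var_Y


def var_quotient(Y, X, var_Y, var_X):
    """Formula (34): variance of Z = Y/X for independent X and Y."""
    return Y ** 2 / X ** 4 * var_X + var_Y / X ** 2


if __name__ == "__main__":
    print(var_proportion(0.322, 177))
    print(var_product(2.0, 3.0, 0.1, 0.2))
    print(var_quotient(Y=109, X=0.322, var_Y=0.0, var_X=0.001233))
```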
To illustrate the setting of confidence limits, let's go through the
procedure for computing confidence limits for formula (1) and formula (3), assuming
that none of the marked animals lose their identity and M is known without error.
The variance of x/n from formula (32) is

$$v(x/n) = (x/n)(1 - x/n)/n. \tag{35}$$

Then from formula (34) the variance of t is

$$v(t) = \frac{M^2}{(x/n)^4} v(x/n) + \frac{1}{(x/n)^2} v(M).$$

However, since v(M) is zero the term $\frac{1}{(x/n)^2} v(M)$ vanishes and

$$v(t) = \frac{M^2}{(x/n)^4} v(x/n) \tag{36}$$

The variance of y from formula (34) is

$$v(y) = \frac{(x/n)^2}{M^4} v(M) + \frac{1}{M^2} v(x/n)$$

However, since v(M) is zero, the term $\frac{(x/n)^2}{M^4} v(M)$ vanishes and

$$v(y) = \frac{1}{M^2} v(x/n) \tag{37}$$

Bailey (1952) gives the following expression for estimating the variance of t:

$$v(t) = M^2 n (n - x)/x^3 \tag{38}$$

Ricker (1958) gives the following expression for the large sample variance of y:

$$v(y) = \frac{x(n - x)}{M^2 n^3} \tag{39}$$
However, these expressions (38) and (39) will give the same results as formu- 


las (36) and (37). This can be illustrated by the following example taken 
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from Ricker (1958, p. 85, example 3A). Thus, M = 109, n = 177 and x = 57. 
By formulas (1) and (3), t = 109/(75/177) = 339 and y = (57/177)/109 = .00295. 


The variance of x/n is: 


v(x/n) = (.322)(.678)/177 = .001233 
and according to formula (36) 


v(t) 2 


109 
4 


i] 


001233 
0322 
1362.677  . 


According to formula (38) 
v(t) = (109)” (177) (177-57) /57° 
= 1362.646 
which gives the same result as formula (36) allowing for errors due to rounding. 


According to formula (37) 


v(y) = ls = (001233 
109 


000,000,010, 38 


and according to formula (39) 


v(y) = __57(177-57) 
(109) 2(177)° 


000,000,010, 38 


which is the same result as obtained from formula (37). 


The approximate 95 percent confidence limits of t and y are 


(t, t) = 339 + 2\| 1362.677 


= 265-413 


ee rr 


.09295 + 2 000,000,010, 38 


il 


and (y, y) 
= ,002,747 - .003,153 
Inverting the limits of y gives limits of t and t equal to 317 — 364 which are 


not symmetrical about the mean. 
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Now let's consider the case where some of the marked animals lose their 
identity and we are using formulas (28) and (30) to estimate the population. 
The variance of (x/n) will be estimated as done previously (formula 35). How- 
ever, now there is also experimstttsii error in our estimate of s. First we 
must estimate the variance of p or P,P.; etc. Let's assume that B-(a)) = P(a,) 
and make a pooled estimate, p. Then from formula (32) 

v(p) = p(l-p)/b) + by . (40) 

If we assume that P. (a) # P:(ay), 
v(P,) ap? Pp, (1-p,)/b, (41) 
and v(p,) = po(l-p,)/b, (42) 


Then our estimate of the variance of Pi 9 from formula (33), assuming 
> 


that we made a pooled estimate p from formula (33), is 


p-v(p) + p-v(p) (43) 


2 [ p’w(p) | 


and if we did not make a pooled estimate, p, the variance formula is 


v(P) 2) 


or v(P; 2) 


ney 42 2 
v(Py 9) a 2) v(P,) + Py v(P.) (44) 
Inasmuch as $P_{1,2}$ is distributed binomially,
\[ v(q) = v(P_{1,2}) \tag{45} \]
and from formula (33) the variance of s can be estimated as follows
\[ v(s) = M^2 v(q) \tag{46} \]
Then, our estimates of the variance of t and y would be as follows
\[ v(t) = \frac{s^2}{(x/n)^4} \, v(x/n) + \frac{1}{(x/n)^2} \, v(s) \tag{47} \]
and
\[ v(y) = \frac{(x/n)^2}{s^4} \, v(s) + \frac{1}{s^2} \, v(x/n) \tag{48} \]
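
The chain of formulas for the pooled two-mark case can be sketched as a single function. This is an illustration under stated assumptions, not the authors' procedure: it assumes s = qM with q = 1 - p^2, takes v(x/n) as the binomial estimate (x/n)(1 - x/n)/n, and uses hypothetical input values chosen only to exercise the code. The function and key names are invented for this sketch.

```python
def mark_loss_variances(M, p, b_total, x, n):
    """Pooled two-mark case, following formulas (40), (43), (45), (46), (47), (48).

    M       -- number of animals originally marked
    p       -- pooled estimate of the probability of losing a single mark
    b_total -- b_1 + b_2, the pooled number of observations behind p
    x, n    -- marked animals in the sample and sample size
    """
    r = x / n
    v_r = r * (1 - r) / n                    # v(x/n), assumed binomial form
    v_p = p * (1 - p) / b_total              # formula (40)
    v_p12 = 2 * p**2 * v_p                   # formula (43)
    v_q = v_p12                              # formula (45)
    v_s = M**2 * v_q                         # formula (46)
    s = M * (1 - p**2)                       # s = qM (assumption stated above)
    v_t = s**2 * v_r / r**4 + v_s / r**2     # formula (47)
    v_y = r**2 * v_s / s**4 + v_r / s**2     # formula (48)
    return {"s": s, "v_p": v_p, "v_s": v_s, "v_t": v_t, "v_y": v_y}


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the report.
    for name, value in mark_loss_variances(M=200, p=0.2, b_total=150, x=40, n=160).items():
        print(name, value)
```

Setting v(s) to zero in formulas (47) and (48) recovers the forms used when no marks are lost, so the second term of each formula isolates the contribution of mark loss.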

By use of the basic formulas presented, v(t) and v(y) can be determined when
more than two marks are applied.

Let's consider an example. We will use the same basic data, modified
somewhat, which we borrowed from Ricker previously. Let's assume that 110
animals were marked originally with two marks and each mark had a probability
of being lost of .10. Then $P_{1,2}$ = (.10)(.10) = .01, q = .99, and
s = .99(110) = 109. (If we had applied only one mark to the animal and p
equaled .10, we would have had to mark 121 animals to expect s to equal 109.)


Out of our sample of 57 animals we found $b_1 + b_2 = 103$ and $a_1 + a_2 = 10$.
Then p = 10/103 or approximately .10 and
\[ v(p) = \frac{.10(1 - .10)}{103} = .0087 \]
and
\[ v(P_{1,2}) = 2\left[ (.10)^2 (.0087) \right] = .000,174 \]
and v(q) = .000,174.
Then,
\[ v(s) = (110)^2 (.000174) = 2.1054 \]
As previously, v(x/n) = .001233, then
\[ v(t) = \frac{(109)^2}{(.322)^4} (.001233) + \frac{1}{(.322)^2} (2.1054) = 1382.982 \]
and
\[ v(y) = \frac{(.322)^2}{(109)^4} (2.1054) + \frac{1}{(109)^2} (.001233) = .000,000,010,532,5 \]


The approximate 95 percent confidence limits of t and y are
\[ (t, t) = 339 \pm 2\sqrt{1382.982} = 265 - 413 \]
and
\[ (y, y) = .00295 \pm 2\sqrt{.000,000,010,5} = .002,745 - .003,155 \]
Inverting the limits of y gives limits of (t, t) equal to 317 - 364. Actually
the variance of s contributed very little to the variance of t and y in the
example presented here, which is shown by the fact that there was no measurable
change in the confidence limits from those presented previously; however, this
may not necessarily be true in all cases.
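
The size of the mark-loss contribution can be made explicit by evaluating the two terms of formula (47) separately. The sketch below takes s, x/n, v(x/n), and v(s) exactly as reported in the example; the function name is illustrative.

```python
import math


def var_t_terms(s, r, v_r, v_s):
    """Return the two additive terms of formula (47) for v(t)."""
    sampling_term = s**2 * v_r / r**4   # due to v(x/n)
    mark_loss_term = v_s / r**2         # due to v(s)
    return sampling_term, mark_loss_term


if __name__ == "__main__":
    # Inputs as reported in the example above.
    sampling, mark_loss = var_t_terms(s=109, r=0.322, v_r=0.001233, v_s=2.1054)
    total = sampling + mark_loss
    print(round(total, 2))              # total v(t)
    print(round(mark_loss / total, 4))  # share of v(t) due to v(s)

    half_width = 2 * math.sqrt(total)
    print(round(339 - half_width), round(339 + half_width))
```

With these inputs the v(s) term accounts for less than 2 percent of v(t), which is why the limits of t, rounded to whole animals, are unchanged.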

DISCUSSION


What has been presented in this report is only a preliminary consideration
of the problem of mark loss. We would like to encourage others to consider
this problem in more detail, especially the mathematical aspects of the
problem and the determination of confidence limits.

The formulas for the determination of confidence limits and the variance
of the estimates are only large sample approximations. However, the question
arises, how large does the sample have to be for the approximation to hold and
for the estimates to be approximately normally distributed? We don't have this
answer, and surely this is worthy of further study. Also, which confidence
limits, (t, t) or the inverted limits of (y, y), are best? In the example, the
inverted limits of (y, y) gave the shortest limits; however, an approximation
cannot be chosen solely because it yields the shortest limits. This is also
worthy of further study.

It should be noted that the formulas for the determination of variance are
generalized. This was done on purpose so that they could be used with different
sampling plans and assumptions. It should be emphasized that the formulas
for the estimation of the variance of the estimate are only large sample
approximations and in small samples they could underestimate the variance.
Formula (34) for the variance of a quotient can be put into the form which
Finney (1952) refers to as a "naive result". He suggests that for the use of
formula (34) X must be at least 9 times its standard error for the setting of
the 95 percent confidence limits. Finney presents formulas for the
determination of fiducial limits of a quotient based on the t distribution and
the Behrens-Fisher distribution which would be appropriate for small samples.
For the assumptions we have used in this report, fiducial limits based on the
Behrens-Fisher distribution would be the most applicable.
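
As a rough illustration of the criterion attributed to Finney, the ratio of a proportion to its standard error can be computed directly. This sketch assumes that X in formula (34) is the denominator of the quotient, as in formula (d) of Appendix I, and that for t the denominator is x/n with the binomial variance (x/n)(1 - x/n)/n; both are assumptions of this sketch, and the function name is illustrative.

```python
import math


def denominator_to_se_ratio(x, n):
    """Ratio of x/n to its binomial standard error (assumed variance form)."""
    r = x / n
    standard_error = math.sqrt(r * (1 - r) / n)
    return r / standard_error


if __name__ == "__main__":
    # x = 57 marked animals in a sample of n = 177, as in the worked example.
    ratio = denominator_to_se_ratio(57, 177)
    print(round(ratio, 2))
    print(ratio >= 9)  # threshold of 9 taken from the text above
```

Under these assumptions the ratio simplifies to sqrt(n * x / (n - x)), so it grows with the sample size and with the fraction of marked animals in the sample.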

There was one source of variation which we did not consider in the
determination of the variance of the estimate. As pointed out previously, even
if the proportion of animals losing mark 1 and mark 2 was known without error,
$P_{1,2}$ is only the most likely proportion of animals losing their identity in
the finite population of animals we have marked. However, if a relatively
large number of animals are marked, i.e., if M is large, we believe that the
contribution of this source of variation to the variance of the estimate will
be negligible and can be ignored.


ACKNOWLEDGMENTS 


We are indebted to Mr. Robert Murry and Mr. Lawrence Soileau, Louisiana
Wild Life and Fisheries Commission, for introducing us to the problem of mark
loss. We have gained much from the suggestions of and discussions with the
above-mentioned persons and with Mr. Scott Overton of the Cooperative Statistical
Project, North Carolina State University; however, inasmuch as we have not always
chosen to take their sound advice, the responsibility for any errors rests solely
with us.




LITERATURE CITED 


Anderson, R. L. and T. A. Bancroft 1952. Statistical theory in research.
McGraw-Hill Book Co., Inc., New York: xix + 399 pp.


Bailey, N. T. J. 1952. Improvements in the interpretation of recapture
data. J. Animal Ecology, 21: 120-127.


Beverton, R. J. H. and S. J. Holt 1957. On the dynamics of exploited fish
populations. Ministry of Agriculture, Fisheries and Food, Fishery
Investigations, Series II, Vol. XIX. Her Majesty's Stationery Office,
London, England: 533 pp.


Dennett, D. and J. B. Kidd 1960. An analysis by tag returns of three years
controlled hunting. Presented at the Fourteenth Annual Conf., S.E. Assoc.
of Game and Fish Commissioners, 14 pp. (processed).


Feller, William 1957. An introduction to probability theory and its applica- 
tions, Vol. 1, 2nd ed., John Wiley and Sons, Inc., New York: xv + 461 pp. 


Finney, D. J. 1952. Statistical method in biological assay. Charles Griffin 
and Company, Limited, London, England: xix + 661 pp. 


Goldberg, Samuel 1960. Probability: an introduction. Prentice-Hall, Inc.,
Englewood Cliffs, N. J.: xiv + 322 pp.


Mainland, D., L. Herrera and M. I. Sutcliffe 1956. Statistical tables for use 
with binomial samples - contingency tests, confidence limits, and sample 
size estimates. Dept. of Medical Statistics, New York University College 
of Medicine, New York: xix + 83 pp. 


Murry, R. E. 1960. Supplemental computations from recovery of marked deer on
Red Dirt - 1959. In Annual Progress Report, Louisiana Wild Life and
Fisheries Commission, Pittman-Robertson Project W-29-9 (Mimeo.).


Ricker, W. E. 1958. Handbook of computations for biological statistics of
fish populations. Fish. Res. Board of Canada, Ottawa, Bull. 119: 300 pp.




APPENDIX I


VARIANCE OF FUNCTION OF INDEPENDENT VARIATES
Let Z be a function of two variates, X and Y, i.e., Z = F(X, Y). Then suppose two
values of X and Y are taken which deviate from their means by the small amounts $\Delta X$
and $\Delta Y$, respectively. Then the deviation $\Delta Z$ of the corresponding value of Z
from its mean is
\[ \Delta Z = \frac{\partial Z}{\partial X} \Delta X + \frac{\partial Z}{\partial Y} \Delta Y \tag{a} \]


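From this we can determine the relations between the variances of Z, X and Y. Then
V(Z), the variance of the resulting Z about its mean, will be $\sum (\Delta Z)^2 / N$. If we square
both sides of equation (a) we get
\[ (\Delta Z)^2 = \left(\frac{\partial Z}{\partial X}\right)^2 (\Delta X)^2 + \left(\frac{\partial Z}{\partial Y}\right)^2 (\Delta Y)^2 + 2 \left(\frac{\partial Z}{\partial X}\right) \left(\frac{\partial Z}{\partial Y}\right) (\Delta X)(\Delta Y) \]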


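Now, $\sum (\Delta X)^2 / N$ and $\sum (\Delta Y)^2 / N$ are the variances of X and Y, respectively, i.e.,
V(X) and V(Y), respectively. Then,
\[ V(Z) = \left(\frac{\partial Z}{\partial X}\right)^2 \frac{\sum (\Delta X)^2}{N} + \left(\frac{\partial Z}{\partial Y}\right)^2 \frac{\sum (\Delta Y)^2}{N} + 2 \left(\frac{\partial Z}{\partial X}\right) \left(\frac{\partial Z}{\partial Y}\right) \frac{\sum (\Delta X \, \Delta Y)}{N} \tag{b} \]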


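However, since X and Y are independent their covariance will equal zero, then
$\sum [(\Delta X)(\Delta Y)] / N$ will equal zero. Thus
$2 (\partial Z / \partial X)(\partial Z / \partial Y) \sum (\Delta X \, \Delta Y) / N$ will vanish
and
\[ V(Z) = \left(\frac{\partial Z}{\partial X}\right)^2 V(X) + \left(\frac{\partial Z}{\partial Y}\right)^2 V(Y) \]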


Now, if Z = XY, then $\partial Z / \partial X = Y$ and $\partial Z / \partial Y = X$ and
\[ V(Z) = Y^2 V(X) + X^2 V(Y) \tag{c} \]
Now, if Z = Y/X, then $\partial Z / \partial X = -Y / X^2$
and $\partial Z / \partial Y = 1 / X$ and
\[ V(Z) = \frac{Y^2}{X^4} V(X) + \frac{1}{X^2} V(Y) \tag{d} \]
Formula (c) determines the variance of a product while formula (d) determines the variance
of a quotient. These are large sample approximations.
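
The general result and its two special cases can be checked against each other numerically. The following sketch approximates the partial derivatives by central differences and compares the outcome with formulas (c) and (d); the function names and the step size h are choices made for this illustration, and the numerical inputs reuse values from the worked example in the text (X = x/n, Y = s).

```python
def delta_method_variance(f, x, y, v_x, v_y, h=1e-6):
    """Approximate V(Z) for Z = f(X, Y), with X and Y independent.

    Partial derivatives are estimated by central differences.
    """
    dz_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dz_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dz_dx**2 * v_x + dz_dy**2 * v_y


def product_variance(x, y, v_x, v_y):
    """Formula (c): Z = XY."""
    return y**2 * v_x + x**2 * v_y


def quotient_variance(x, y, v_x, v_y):
    """Formula (d): Z = Y/X."""
    return y**2 / x**4 * v_x + v_y / x**2


if __name__ == "__main__":
    x, y, v_x, v_y = 0.322, 109.0, 0.001233, 2.1054
    print(delta_method_variance(lambda a, b: a * b, x, y, v_x, v_y),
          product_variance(x, y, v_x, v_y))
    print(delta_method_variance(lambda a, b: b / a, x, y, v_x, v_y),
          quotient_variance(x, y, v_x, v_y))
```

With these inputs formula (d) reproduces the value of v(t) obtained from formula (47), since t is the quotient of s and x/n.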